Detecting Semantically Correct Changes to Relevant Unordered Hidden Web Data

Authors

  • Vladimir Kovalev
  • Sourav S. Bhowmick
Abstract

Current proposals for XML change detection rely on structural constraints and ignore semantic constraints. Consequently, they may report semantically incorrect changes. In this paper, we argue that the semantics of the data is important for change detection. We present a semantics-conscious change detection technique for hidden web data. In our approach, we transform the unordered hidden web query results into XML format and then detect the changes between two versions of the XML representation of the hidden web data by extending X-Diff, a published unordered XML change detection algorithm. By taking advantage of the semantics, we experimentally demonstrate that our change detection approach runs up to 7 times faster than X-Diff on real-life hidden web data and always detects changes that are semantically more correct than those detected by existing proposals.
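To make the idea of semantics-aware, order-insensitive change detection concrete, here is a minimal Python sketch. It is not the authors' algorithm and it is not X-Diff. It only illustrates one possible semantic constraint: treating a field as a record key, so that two versions of an XML result set are compared record by record regardless of ordering. The element names (`results`, `item`, `id`, `title`, `price`) and the helper functions are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET


def records_by_key(xml_text, record_tag, key_tag):
    """Index each record element by the text of its key child.

    Assumption for this sketch: every record has exactly one key child
    and keys are unique within a version.
    """
    root = ET.fromstring(xml_text)
    index = {}
    for rec in root.iter(record_tag):
        key = rec.findtext(key_tag)
        # Store fields as a dict so that the order of child elements is ignored.
        index[key] = {child.tag: (child.text or "").strip() for child in rec}
    return index


def detect_changes(old_xml, new_xml, record_tag="item", key_tag="id"):
    """Report inserted, deleted, and updated record keys between two versions."""
    old = records_by_key(old_xml, record_tag, key_tag)
    new = records_by_key(new_xml, record_tag, key_tag)
    inserted = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    updated = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"inserted": inserted, "deleted": deleted, "updated": updated}


# Hypothetical query results: the second version reorders records and fields,
# changes one price, and adds one record.
OLD = """<results>
  <item><id>A1</id><title>Widget</title><price>10</price></item>
  <item><id>B2</id><title>Gadget</title><price>25</price></item>
</results>"""

NEW = """<results>
  <item><price>25</price><id>B2</id><title>Gadget</title></item>
  <item><id>C3</id><title>Doohickey</title><price>7</price></item>
  <item><id>A1</id><title>Widget</title><price>12</price></item>
</results>"""

changes = detect_changes(OLD, NEW)
print(changes)
```

In this sketch the key prevents a purely structural matcher from pairing record A1 with an unrelated record that happens to look similar, which is the kind of semantically incorrect change the abstract refers to. Keyed lookup also avoids comparing every record against every other record, although the speedup reported in the abstract comes from the authors' own technique, not from this simplification.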

Similar resources

Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means

One of the most important concerns of a data miner is always to have accurate and error-free data, that is, data that does not contain human errors and whose records are complete and correct. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...

Semantics-based Dynamic Hypermedia Adaptation using the Hidden Markov Model

Information collection, selection, structuring, and presentation design are the core considerations for general hypermedia presentation generation systems. The content collection process can be enhanced by retrieving semantically related information objects, relevant to the topic selected by an author. Once relevant information objects are available, the content selection process suggests seman...

Semantic Recognition of Ontology Refactoring

Ontologies are used for sharing information and are often collaboratively developed. They are adapted for different applications and domains, resulting in multiple versions of an ontology that are caused by changes and refactorings. Quite often, ontology versions (or parts of them) are syntactically very different but semantically equivalent. While there is existing work on detecting syntactical a...

Automatic hidden-web table interpretation, conceptualization, and semantic annotation

The longstanding problem of automatic table interpretation still eludes us. Its solution would not only be an aid to table processing applications such as large volume table conversion, but would also be an aid in solving related problems such as information extraction, semantic annotation, and semi-structured data management. In this paper, we offer a solution for the common special case in w...

Semantically Plagiarism Detection System Using Web Services

Plagiarism is the “wrongful appropriation” and “stealing and publication” of another author's “language, thoughts, ideas, or expressions” and the representation of them as one's own original work. Plagiarism can also be hidden when text is translated from one language to another with no credit to the original version, which is called cross-language plagiarism. Plagiarism is widely found in text, document...

Publication date: 2005